-
Notifications
You must be signed in to change notification settings - Fork 13
Feature/fdb 327 fdb compare #165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
|
This was supposed to be a draft pull request.... |
684c81d to
722f460
Compare
722f460 to
38d2604
Compare
7b4aaf0 to
232d0b2
Compare
d6efd5d to
957b3c0
Compare
f6064e4 to
c814917
Compare
|
Pending tasks:
|
c814917 to
2daf3e2
Compare
…wo FDB no comparison or testing done yet
…nore certain key-value pairs
bcef5ed to
b643bc8
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## develop #165 +/- ##
===========================================
- Coverage 73.30% 72.66% -0.64%
===========================================
Files 363 377 +14
Lines 21956 22712 +756
Branches 2253 2383 +130
===========================================
+ Hits 16094 16504 +410
- Misses 5862 6208 +346 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
3508d48 to
1b031b4
Compare
| @@ -0,0 +1,7 @@ | |||
| ecbuild_configure_file( mismatch_mars.sh.in mismatch_mars.sh @ONLY ) | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect to see an ecbuild command getting test data (probably for more than one test), and then the test depending on it.
It isn't great to add meaningful-sized binary files to the git repo...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach is following the same approach as the regression tests.
I can avoid duplicating them. Or do you explicitly want them to be uploaded on get.ecmwf.int
| @@ -0,0 +1,7 @@ | |||
| ecbuild_configure_file( mismatch_grib.sh.in mismatch_grib.sh @ONLY ) | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the data.grib files differ between the different tests? It looks like you are grib setting them, to set upthe tests on the go. So we probably only need one grib file. You migt even be able to use one which is already referenced in a different test, which would avoid the extra data.
|
|
||
| FDBCompare(int argc, char** argv) : FDBVisitTool(argc, argv, "class,expver") { | ||
| options_.push_back(new SimpleOption<std::string>("test-config", "Path to a FDB config")); | ||
| options_.push_back(new SimpleOption<std::string>("reference-config", "Path to a second FDB config")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it important that there is a test and a reference? Is the compare not symmetric? I would have thought that --config-one and --config-two would suffice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comparison is symmetric, it's just about naming. Don't have an opinion on that. It's just that the convention is used at various places in the code right now.
| options_.push_back(new SimpleOption<std::string>("test-config", "Path to a FDB config")); | ||
| options_.push_back(new SimpleOption<std::string>("reference-config", "Path to a second FDB config")); | ||
| options_.push_back(new SimpleOption<std::string>( | ||
| "reference-request", "Mars key request for reference FDB entry (e.g reference experiment)")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is a "mars key request" that is not a phrase that makes sense. At the least this needs an example.
| Method parseMethod(const std::string& s); | ||
|
|
||
|
|
||
| struct Options { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is strange being in a file called Scope.
| /// @param test Test location of the message | ||
| /// @param opts Additional options | ||
| /// @return Comparison result. As mars is not comparing data, only matched is expected to be set | ||
| Result compareGrib(const DataIndex& ref, const DataIndex& test, const Options& opts); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must admit that structurally, I would have written this with a comparator class (initialised with Options). And then have the different types of comparison be internal to that class (probably as member functions, but they could be in an anonymous namespace).
Roughly happy with this as is, but stylistically it is not really the same as the rest of the FDB/MARS code in that regard. I would err towards making it so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I explicitly removed a factory. The comparison calls on mars and grib are explicitly called in order. I don't see a reason to wrap a layer of indirection.
| } | ||
| else { | ||
| fdbtest = FDB(config(args)); | ||
| fdbref = FDB(config(args)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this re-assignment results in default-initialising (and constructing) two FDBs. Depending on the current environment that may fail (throwing an exception) before you try and construct the FDBs with the supplied configs. That would be an erronious failure mechanism.
I think that the easiest solution here is to create a function returning a pair<> of FDBs (or perhaps better, a single use struct containing two FDBs) such that you can avoid this.
| } | ||
|
|
||
| if (refReqString_) { | ||
| opts_.referenceRequest = parseKeyValues(*refReqString_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't clear to me what a reference/test request is. And I am hardly a new user to this ecosystem. This needs some elaboration/commentary.
| } | ||
|
|
||
|
|
||
| // Return if only a comparison of Mars metadata messages was specified as Command Line option |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does only this line warrant a comment. There are bits that could do with explanatory code. Not convinced this is the most significant...
| if (opts_.scope == Scope::All) { | ||
| std::cout << "****************** SUMMARY ********************" << std::endl; | ||
| std::cout << gribRes << std::endl; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect a compare tool to return success or failure depending on the output. This very much looks like it prints the output, but then returns zero. That makes it much harder to integrate into wider scripts.
simondsmart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few extra details...
|
|
||
| for (const auto& [rk, rv] : r) { | ||
| auto search = l.find(rk); | ||
| if (search == r.end()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug here. search comes from l.find but is being compared against r.end. Iterators are not comparable between containers.
| } | ||
|
|
||
| if ((testReqString_ && !refReqString_) || (!testReqString_ && refReqString_)) { | ||
| throw UserError("Options --reference-request and --tests-request must either both be set or both be omitted.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be --test-request not --tests-request
|
|
||
| // EditionNumber will is always the 8th Byte in a Grib message | ||
| int editionnumberRef = (int)bufferRef[7]; | ||
| int editionnumberTest = (int)bufferRef[7]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both editionnumberRef and editionnumberTest are read from bufferRef, and so are equal by construction. Should be using bufferTest.
Please add a test to check that this is working.
|
|
||
| if (holdsVector(valRef)) { | ||
| if (holdsVector(valTest)) { | ||
| auto l1 = vectorLength(valRef); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that vectorLength returns bool. So this is only testing if they are both empty/both have data. Doesn't test lengths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh. good catch.
Description
Opening this as a draft pull request, while finishing the refactoring on the grib comparison but to start discussing the design and features.
This draft shows the direction in which I would like to continue refactoring.
This tool allows to compare two FDB or two experiments in one FDB. It uses the standard fdb tool mechanisms, minimal-keys, CLI request, but also allows to directly specify mars keys, where entries should be ignored. it uses fdb list to generate a request then sorts the results and compares them, depending on scope, where the default is just a comparison of the mars keys, without the messages itself, possibility of comparing grib-header only and then grib-header plus data section. Default is a key by key comparison,
More explanation and examples can be found here:
https://confluence.ecmwf.int/display/~ecm4053/FDB+Comparison
Contributor Declaration
By opening this pull request, I affirm the following:
🌈🌦️📖🚧 Documentation 🚧📖🌦️🌈
https://sites.ecmwf.int/docs/dev-section/fdb/pull-requests/PR-165
🌈🌦️📖🚧 Documentation Z3FDB 🚧📖🌦️🌈
https://sites.ecmwf.int/docs/dev-section/z3fdb/pull-requests/PR-165
🌈🌦️📖🚧 Documentation FDB 🚧📖🌦️🌈
https://sites.ecmwf.int/docs/dev-section/fdb/pull-requests/PR-165